Data Cleaning
Data cleaning is the process of turning the data you have into data that is usable. It is, for lack of a better term, the fight against entropy in the data domain.
Domain Knowledge
To be successful at data cleaning, understanding the business domain is paramount. This is a good chance to team up with domain experts and draw on their intricate understanding of the business cases.
Approach to Cleaning and Understanding Data
I like this approach, as it has worked well on some of my previous projects.
Links
Thoughts
- I always think back to the Lester Freamon quote from the first season of The Wire, the "this is the job" quote.
- Always write down things that you find interesting, weird or out of place. These notes are a great starting point for later discussions with stakeholders.
- Data versus
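
As a rough illustration of the "write down what looks weird" habit from the list above, here is a minimal sketch that scans a handful of records and collects findings to bring to stakeholders. Everything in it is an assumption made for illustration: the `Finding` class, the `collect_findings` helper, the two checks (blank values and exact duplicate rows), and the sample rows are all hypothetical. Real data will need checks that come out of the domain.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """One observation worth raising with stakeholders (hypothetical structure)."""
    row: int
    column: str
    note: str


def collect_findings(rows):
    """Scan rows (a list of dicts) and note anything that looks odd."""
    findings = []
    seen = set()
    for i, row in enumerate(rows):
        # Check 1: missing or blank values
        for column, value in row.items():
            if value is None or str(value).strip() == "":
                findings.append(Finding(i, column, "missing value"))
        # Check 2: exact duplicates of an earlier row
        key = tuple(sorted(row.items()))
        if key in seen:
            findings.append(Finding(i, "*", "duplicate of an earlier row"))
        seen.add(key)
    return findings


# Hypothetical sample data, made up for this example
rows = [
    {"customer": "A-1", "country": "DE", "amount": "10.50"},
    {"customer": "A-2", "country": "", "amount": "7.00"},
    {"customer": "A-1", "country": "DE", "amount": "10.50"},
]

for f in collect_findings(rows):
    print(f"row {f.row}, column {f.column}: {f.note}")
```

The sketch only records what it sees and does not fix or drop anything. Whether a blank country or a repeated row is actually an error is a question for the people who know the domain.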